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ABSTRACT 

We present an algorithm which is designed to allow the efficient identification and 
preliminary dynamical analysis of thousands of structures and substructures in large 
N-body simulations. First we utilise a refined density gradient system (based on Den- 
max) to identify the structures, and then apply an iterative approximate method to 
identify unbound particles, allowing fast calculation of bound substructures. After pro- 
ducing a catalog of separate energetically bound substructures we check to see which of 
these are energetically bound to adjacent substructures. For such bound complex sub- 
halos, we combine components and check if additional free particles are also bound to 
the union, repeating the process iteratively until no further changes are found. Thus 
our subhalos can contain more than one density maximum, but the scheme is sta- 
ble: starting with a small smoothing length initially produces small structures which 
must be combined later, and starting with a large smoothing length produces large 
structures within which sub-substructure is found. We a pply this algorithm to three 
simulations. Two whic h are using the TPM a lgorithm bv iBode et alJ l)2000() and one 
on a simulated halo bv lDiemand et all l)2004(l . For all these halos we find about 5-8% 
of the mass in substructures. 
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1 INTRODUCTION 

Until recently, observational extra-galactic astronomy has 
been based primarily on the study of galaxies and clusters of 
galaxies. The theoretical constructs in the standard ACDM 
paradigm for structure formation which are most closely as- 
sociated with these phenomena are "halos" of dark matter 
and the "subhalos" within them. In this bottom up picture, 
all self gravitating virialised objects are comprised of accu- 
mulated smaller objects, and these latter, hierarchically, of 
still smaller ones ad infinitum, assembled through "merger 
trees". Thus a close examination of any representative ob- 
ject should show the undigested remnant cores of previously 
ingested objects, tidal streamers of debris shredded from 
the outer parts of these same subhalos, and the relatively 
smooth background material which contains the somewhat 
phase mixed accumulation of all the digested tidal effluvia. 
A closer and closer analysis in phase space would allow iden- 
tification of components added at earlier and earlier times. 

Thus "identification of substructure", even if perfect 
tools were available, requires some intellectual precision in 
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the dynamical definitions of what is meant by "subhalos". 
Until recently the lack of sufficiently accurate computations 
made this issue moot, but now investigators have begun this 
analysis, using a variety of defined terms. We will provide 
our own definitions later in this section. 

Historically, it was impossible to produce gala xy-size 



halos in dense clusters w i th dark matter simulations <|Whit 
| l976Uvan KampeiJll995 ISummers et alJll995tlMoore et al 
1996). This was mainly due to the limited mass and force res- 
olution of the simulations used and was commonly known as 
the over-merging problem. The major causes of this problem 
were premature tidal disruption due to inadequate force res- 
olution and two-par ticle evaporation fo r halos with a small 
number of particles i'Kl vpin et alJll999fl . However the com- 
bination of an increase in computing power and the inven- 
tion of more efficient algorithms has led to promising de- 
velopments over the recent years which have overcome the 
numerical problem s ( Ghigna ct al. 199cf |KlY_p jii_et_aI][l99g|; 
[ Moore et alJ Il999t lokamoto fc Habd Il999t iGhiena et alJ 
pOOdlBode et alJl200d ISprineel et al-lboOltbe Lucia et al 



20041: iKravtsov et al] 12003ft . Besides the numerical insuf- 
ficiencies which can destroy substructures, there are also 
physical reasons for the destruction of structure, which are 
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tightly connected to the numerical problems. First, there 
is dynamical friction, which drives the subhalo to the halo 
centre where it can be disrupted and merge with the cen- 
tral object. Second, there is tidal stripping when the tidal 
force from the halo on the subhalo is larger than the grav- 
itational force holding the subhalo together. Furthermore, 
there may be shock heating which occurs during the close 
passage of two subhalos, and more dominantly on passing 
of a subhalo near the halo centre; thi s effect is believed 
to be less prominent than the first tw o llMoore et al.lll993 
iKlvpin et alJll999HGnedin et alJll99^ . 

iKlvoin et, alJ l)l999h estimated that a force softening 
of e = 3h^ 1 kpc and a mass resolution below m p — 
10 9 ft -1 Mq would be sufficient to identify a substructure 
of mass 10 11 ft -1 Mq with at least 30 particles at a distance 
70ft _1 kpc from the centre of a 1O 14 /i -1 M0 cluster. Needless 
to say, higher resolution would be even better. The usual ap- 
proach to obtain such resolution is to take a cluster from a 
cosmological N-body simulation and re-simulate it at higher 
resolution with inclusion of the long distance (tidal) gravita- 
tional fields. However if one wants to address the problem of 
substructure in a statistical and cosmological context, then 
one needs fairly large simulation boxes. Thus one cannot 
currently use, with existing computing power, much higher 
resolution than given above. 

Our goal is to design algorithms that can be used 
to analyze structure/substructure in very large simula- 
tions such as "ligh t cone radians" of the Virgo group 
fevrard et alJ [2002) rather than individual very high res- 
olution simulations of clusters. Besides the noted numer- 
ical difficulties, the identification of structures and sub- 
structures in large N-body simulations is a long stand- 
ing problem of principle. This has been addressed in 
the past by many dif ferent methods, mainly geometrical 
rather than physical llHuchra fc Gellerl Il982t IDavis et alJ 
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exploit to 



some extent the friends-of-friends (FOF'l ([Huchra fc Gelled 
19821: IDavis et al.l Il985t ILacev fc Coll Il994l or the Den- 



max feertschinger fc Gelbll99ll:lGelb fc Bertschingerlll994 
lEisensteTri^^HutT^9^1 ~algorithm (described in Section 0, 
which are also at the centre of our method. What these 
methods have in common is that they are essentially geo- 
metrical and do not use the entire phase space information, 
and hence need post processing to test for bound structures. 
In this paper we discuss a fast approximate method to re- 
move unbound particles from halos. 

Algorithms which have been used fo r finding bound 
structure include SKID (Sta del et al ll997h and hierarchical 
adaptations of it, BD M iKlvpin et al.ll999l) . and SUBFIND 
iSpringel et al] 1200 J) . Quite recen tl y oth er methods have 
been intro duced by [K im fc Park] ]2004) . iNevrinck et al] 
]2004) . and lGill et al] ]2004) . SKID essentially uses the Den- 
max algorithm to identify structures, and then calculates 
bound structures by iteratively removing the unbound parti- 
cle with the largest total e nergy until all parti cles are bound. 
The hierarchical scheme llGhigna et al.ll2000f) uses SKID at 
three different smoothing lengths. The BDM (bound den- 
sity maximum) scheme places spheres of a certain scale r sp 
in the simulation box, and then displaces the spheres to the 



centre of mass of the particles inside the sphere. This pro- 
cess is iterated and eventually all maxima within a sphere 
of size r sp are found. The unbinding is then done by cal- 
culating the escape velocity of the halo from the maximal 
circular velocity; all particles with velocities larger than the 
escape velocity are removed. For the calculation of the es- 
cape velocit y a Navarro- Frenk- W hite (NFW) density profile 
is assumed jNavarro et alill995h . Recently BDM has been 
used to ide ntify a vast number of halos in a large detailed 
simulation (iKravtsov et al]l2003h . The SUBFIND algorithm 
uses the FOF algorithm to find cluster-sized halos, and then 
looks for saddle points in the density field to identify sub- 
halos. Again, the particles of a subhalo are then examined 
to determine if they are bound. Recently 11 re-simulated 
clusters have been ana lysed in great detail with this method 
]De Lucia et al.ll2004) . 



Most of the work quoted above used the re-sampling 
technique and consequently only analysed a small number 
of "typical halos" to high accuracy. Here we take a com- 
plementary approach, by using simulations of volumes con- 
taining many target halos. While sacrificing resolution (as 
compared to the re-sampling technique) we gain in sample 
size by a large factor, with thousands of halos in our largest 
runs. In this paper we will study two quite different simula- 
tions. One contains 256 3 particles in a volume 20/i _1 Mpc on 
a side; the halos from this run have masses typ ical of large 
galaxi es. This run is discussed in more detail in lBode et al.l 
(2001); it was evolved with a P 3 M code, and halted at red- 
shift z=l. The second simulation is of 1024 3 particles with 
box size 320/i -1 Mpc, containing many galaxy cluster sized 
halos. This was evolved to z=0.05 using the T ree-Particle- 
Mesh (TPM) algorithm llBode fc OstrikeJ2003fl . The simu- 
lation parameters can be found in Table One difference 
between the two codes used is that P 3 M uses Plummer, and 
TPM uses spline, softening. 



We will define a subhalo at any level of the hierarchy 
in the following fashion. In the centre of mass frame defined 
by the object in question, we take all particles as members 
which are gravitationally bound (E < 0). Thus, if a small 
smoothing length has been used to identify sub-clumps, we 
check if groups of these are bound to one another and if ad- 
ditional "free" particles are bound to the assemblage. Con- 
versely, if a larger smoothing length has been used to identify 
objects we subsequently analyse these with greater refine- 
ment to ascertain subcomponents which in their own frames 
are self-gravitating. Thus we produce a catalog which pro- 
vides labels for a hierarchy of bound objects, where the cat- 
alog is, to a large extent, independent of the geometrical 
tools used to parse the entire object. We then make an in- 
dependent catalog of the hierarchy, where at each level we 
require all components to be gravitationally bound to the 
object to which they are attached. 



The purpose of this paper is to clearly define the method 
and attempt to carefully specify the algorithms that define 
and identify substructure and to explain how seemingly mi- 
nor variations in procedure can produce large changes in the 
final result. 
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Model 


z 




n c 


n A 


TT- f km / sec 1 
H ° [ Mpc J 


<?8 


n 


JV 


L [Mpc/h] 


m v [M@/h] 


e [kpc/h] 


ACDM 
ACDM 


1.0 
0.05 


0.04 
0.04 


0.26 
0.26 


0.70 
0.70 


67 
70 


0.900 
0.975 


1.0 
1.0 


256 3 
1024 3 


20 
320 


3.97 x 10 7 
2.54 x 10 9 


1.2 
3.2 


VIRGO 






1.0 


0.0 


50 


0.7 




1314161 




8.6 x 10 8 


5,10 



Table 1. In the top two rows the parameters for the two simulations analysed i n this paper, co ld dark matter universes with a cosmological 
constant. In the last row the parameters for the re-simulated cluster by iGhigna et alj|l998l) . 

2 THE METHOD 

Before entering into the details of the method, we present 
a schematic overview of the substructure finding algorithm 
which we will employ. We first apply the FOF method, which 
groups together large structures in a speedy way, on the en- 
tire simulation volume. At the core of our approach is the 
geometrically based Denmax routine by Bertschinger & Gelb 
(1991), which moves particles up density gradients and iden- 
tifies groups as all particles reaching the same density max- 
imum. We run Denmax with high resolution on each FOF 
halo. We then build a family tree and identify, with an it- 
erative approximation scheme, energetically bound particles 
within the structures. 

In this way we create, hierarchically, (a) gravitationally 
bound objects ("mothers"), (b) those substructures which 
lie within a given bound object ("daughters") and are them- 
selves gravitationally bound, and (c) further sub-levels. 



20 



s 



TO 




2.1 Creating the family trees 

The first step to identify large groups in the simulation is 
the application of the FOF routine. We choose as a linking 
length -Rimk = 0.2T2," 1 / 3 , where n~ 1//3 is the mean inter- 
particle separation. This ensures that we find clusters, and 
also trace them out to the virial radius. Furthermore, this 
choice will select groups of particles with over-densities close 
to the value predicted by the spherical collapse model. With 
this linking length, and a minimum number of ten parti- 
cles required to be identified as a group, FOF finds a large 
number of low mass halos and a decreasing number of more 
massive objects. 

We estimate the density at each point of the simulation 
by measuring the weighted volume over the 16 nearest neigh- 
bours of each particle in the simulation (using the SMOOTH 
code; see |http://www-hpcc. astro.washington.edu/tools \ . 
This enables us to estimate the position of the density peak 
of a halo. Also, the smallest rectangular box enclosing each 
halo is found. Next we order the groups according to their 
mass and deploy a bottom-up scheme for identifying which 
groups and particles are within more massive structures. For 
each halo in turn (starting with the least massive), the re- 
mainder of the list is searched to see if the density peak is 
within a box containing a more massive structure; if such 
a box is found then the halo is associated with the more 
massive structure. If a structure already has associated sub- 
structures, they will also belong to the bigger structure. If 
the density peak is not within any other box, the search is 
repeated to check if there is an overlap of the minimum size 
boxes, and if an overlap is found the halo is associated with 
the more massive structure. In this way each structure will 
either belong uniquely to a combined group or be an iso- 



Figure 1. Projection of a simulation with 256 3 particles. The 
boxes are the minimum size boxes of the five most massive halos 
found with FOF with a linking length of Ru n ^ = 1/ [5 v 7 ^] • Note 
that the apparent overlap of the boxes is just a projection effect. 

lated structure. We then calculate the minimal size box of 
each combined group or isolated structure and read in all 
particles inside this box, as long as they have not already 
been identified as belonging to another structure. Note that 
in this way each particle which has been associated with a 
structure by FOF belongs uniquely to a family. However a 
small number of particles which have not been associated to 
a structure by FOF (either by being isolated or belonging 
to a group with less than 10 particles) might belong to more 
than one family; these particles are usually at the margin of 
the family and are not significant for the further analysis. 

In Fig. we show the projection of the simulation with 
256 3 particles in a box of length L = 20/i -1 Mpc at a redshift 
z = 1 and a mass resolution of ~ 4x 10 7 M.o/h. The boxes are 
the minimum size boxes of the five most massive structures 
in the simulation. 

This rough analysis of structure enables one already to 
estimate the mass distribution in the simulation. In Fig. [5] 
we show the mass distribution of families for the 1024 
simulation described in Table Q The dominance of low 
mass objects is clear. Also we show a fit to the slope of 
the distribution with to a genera lised Schechter function 
JPress fc Schechterlll974t ISchechterlll976T) dn h /dln(M h ) = 
N*(M h /M*)- a exp(-M h /M*) and obtain a « 0.9. The fit 
was performed with a nonlinear least-squares Marquadt- 
Levenberg algorithm. Note that at this stage we plot the 
mass function of the families, which makes it harder to com- 
pare with the standard Press-Schechter prescription, which 
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Figure 2. Mass function of clusters with more than 5000 particles 
for the 1024 3 simulation (right). The dotted line is the Schechter 
function with a slope of a = 0.9 and exponential cut-off at a scale 
M„ = 8.0 X 1 14 h~ 1 Wlpi. For com parison we also plot the mass 
function from lEvrard et alJ (2002) (dashed line). 




Figure 3. Projection of all particles in the mother halo (back- 
ground), and the five most massive substructures found by 
the refined Denmax run. The cluster has a total mass of 
1.3 X 10 13 /i — 'Mq and the substructures range between 1.2 X 
10 9 /l _1 Me) and 2.3 X 10 11 ft- 1 Mo for this simulation. 



assumes virial masses and does not take into account the 
linking of overlapping structures , however our findin gs are 
consistent with previous work llGhigna et afll2000l) . The 
dashed li ne in Figure from shows the distribution mea- 
sured bv lEvrard et alJ 1)2002). which establishes that this 
rough catalog agrees well with standard expectations. 

After this first step we have identified large structures 
in the simulation and assigned all particles which potentially 
belong to these structures. This will enable us in the next 
step to refine the analysis within a single family. 



2.2 Identification of substructure and bound 
particles in halos 

We are now in the position to study a single fam- 
ily in more detail. We first perform an identification 
of groups within one family usi n g the Denmax al- 
gorit hm feertschinger fc Gelbl Il99lt iGelb fc BertschingerJ 



1994). Denmax first interpolates the density field p by apply- 
ing a Gaussian kernel with a given smoothing length i? sm0 oth 
to the particle positions. The particles are then shifted along 
the density gradient via the fluid equation 

^ =V S JL. (1) 
dr p 

Each particle moves toward a density maximum where it 
comes to rest, or more probably oscillates around the peak. 
The groups are then identified by using the FOF scheme on 
the shifted particles, with a linking length comparable to 
^smooth- We use a much smaller smoothing length i? sm0 oth 
than the linking length i?ij n k in the FOF scheme used pre- 
viously for finding the rough structures. We take 



/subS , 



(2) 



where e is the softening length of the simulation and / su b is a 
free parameter in our analysis, which we typically choose to 
be / S ub = 5. This choice ensures that we identify the smallest 



structures which are still ab ove th e resolution threshold of 
3I200Q,). We also set the thresh- 



the simulation ( Ghigna ct 
old for the minimum number of particles in a group to 10. In 
this way we obtain a list of groups within the single family. 

After the refined Denmax step there are still particles 
which are not assigned to any group with more than 10 par- 
ticles. For each such particle, we locate the nearest neigh- 
bour structures and calculate the distance Sr to their density 
peak positions. We also calculate the distance to the peak of 
the most massive group, which we call the mother halo. We 
then calculate m/8r 2 , where m is the mass of the neighbour, 
and assign the particle to the group (or the mother) where 
this quantity is maximal. We note that any mis- assignments 
made at this stage will be rectified at a later stage in the 
analysis, and the purpose of this simple criterion is to min- 
imise the necessary amount of reassignment. 

As an example, we show in Fig. 0the five most massive 
substructures identified in the most massive mother halo of 
the 256 3 simulation, which has initially « 321, 000 particles, 
or a mass of 1.3xlO 13 /i _1 M . The masses of all the substruc- 
tures vary between 1.2 X 10 9 /i _1 M© and 2.3 x 1O 11 /i~ 1 M , 
where we assume we can reliably identify a substructure if 
it comprises of at least 30 particles. 

The next step is the build up of the family tree within 
this family. In order to obtain the family tree, we calculate 
the minimum size box which contains each identified sub- 
structure. Then, as before, we apply a bottom-up scheme 
starting with the lowest mass halo and determine if its den- 
sity peak is within the minimal box enclosing a more massive 
structure. The structure with the lowest mass which contains 
the halo is identified to be the mother of this halo, while the 
halo becomes the daughter and hence a substructure of the 
mother. If the density peak is not within any other halo, 
we check if the minimal box is overlapping with any other 
box. In this case we take the lowest mass overlap halo as the 
mother. We then move to the next more massive halo and re- 
peat the procedure. Once we have identified the mother, all 
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Figure 4. Schematic description of the family tree. Halo number 
1 counts all subhalos 2 through 9 as her daughters. Halo 1 is the 
mother of subhalos 2, 3, and 4. Subhalos 5, 6, and 7 are also the 
daughters of subhalo 3, and see 3 as their mother; 7 and 8 are in 
a similar relation to subhalo 2. Number 4 is an isolated subhalo. 



the substructures of the daughter will also become daughters 
of the mother. In this way we obtain a unique mother for 
each halo, and for each mother a list of daughters which con- 
tains all substructures of the hierarchy. We should actually 
talk of daughters, grand-daughters, great grand-daughters 
and so on, but there is no need to distinguish daughters and 
grand-daughters from a mother's point of view, as long as 
each daughter knows who her mother is — which is ensured 
by our procedure. In other words, each mother knows about 
the whole younger generation, but only her mother from 
among her ancestors. "Isolated" substructures will have the 
original mother halo as a mother. In Fig.0]we show schemat- 
ically the build up of a family tree. 

We further introduce a threshold particle number N t . 
Structures with fewer particles than Nt are dissolved into 
their associated mothers. Typically we choose Nt = 30, dis - 
carding smaller groups found earlier. In lGhigna" et alJ (l2000l1 
a threshold of Nt — 16 has been used for using halos as 
tracers, but Nt = 32 for the reliable an alysis of properties of 
halos. However iDiemand et alJ (120041) find that the results 
are only stable for Nt = 100. This is necessary to resolve 
a complete sample of subhalos. In Fig. |5]we show the dis- 
tribution of substructures masses in the most massive halo 
of the 256 3 run, and in the 4th most massive halo from the 
1024 3 simulation. In the following we call these halos clus- 
ter #1 (left) and cluster #4 (right). The entire structure, 
including the mother halo and all daughters, has a mass of 
1.3 x 10 13 fo _1 M Q for cluster #1 and 1.6 x 10 15 /i _1 M Q for 
cluster #4. We clearly see the large number of small struc- 
tures (solid lines); the original distribution follows roughly 
a scaling of dNh/dln(Mh) oc M^ 1A (dotted lines). This be- 
comes even steeper if the halos with less than Nt = 30 par- 
ticles are dissolved into their mothers (dashed lines). The 
Denmax routine had originally recovered about 5100 sub- 
structures which have more than ten particles within clus- 
ter #1, and 7760 halos within cluster #4. About 20-25% of 
the total mass of the structure is in halos with less than 30 
particles as identified by Denmax. After halos with less than 
30 particles have been dissolved into their mothers we are 
left with about 1790 substructures in cluster #1 and 2590 in 
cluster #4. During this procedure the original mother halo 
gained 5.7 x 1O 11 /i" 1 M for cluster #1 and 6.4 x 10 13 /i _1 M Q 



for halo #4. The rest of the mass is distributed among the 
lower mass halos, as seen in the dashed histograms in Fig. [5] 
As mentioned in the introduction, Denmax itself has 
been applied in a hierarchical way, either as part of SKID 
by applying three different sm oothing lengths Z]j n k = 
1.5, 5, lOLjoft llGhigna et al-lEoOoh . or by using it on larger 
scales with 7? sm0 oth = O ^n" 1 ^ 3 and re-analysi ng each halo 
with -Rsmooth = O.ln - 1/3 (iNevrinck et alj|2004h . The reason 
for this is that in general there is no single smoothing length 
which is suited to find structures over a large mass range in 
the simulation. If the smoothing length is too large then 
small structures are not resolved, and if it is too small then 
large structures are broken up. We choose a small smooth- 
ing length and recombine larger objects using the family tree 
hierarchy. 

We have now a clearly defined, geometrically based pic- 
ture of substructures, which we can proceed to analyse in a 
more physical fashion so that unbound particles are culled 
out. In some situations the Denmax procedure may err in 
assigning some particles to substructures. Imagine a particle 
which is dynamically a part of the mother halo: the Denmax 
algorithm will move this particle toward the cluster centre, 
but if a significant substructure just happens to intervene, 
the particle will reach this local maximum and stop. Thus 
there will be particles extending in a radial wedge outside of 
any bound structure arbitrarily attached to it, even if they 
are gravitationally not bound to it. To correct for such un- 
physical identifications, we need now a post-identification 
dynamical treatment of the halos. 



2.2.1 Velocity outliers 

It can be shown feinnev fc Tremaind Il987h that the rms 
escape velocity from a finite, bound self-gravitating system 
is related to the rms velocity by (ytse) = 4«r ms . Thus par- 
ticles having a velocity greater than y 2 (v| sc ) = VSvims 
are very unlikely to be bound to the structure. One way to 
calculate the escape velocity is by measuri ng the ma ximum 
value of the circular velocity v c i IC (r) = -J GM(r)/r; by as- 
suming an NFW profile th is can be related to the escape 
velocity jKlvpin et al.lfl999r> . However this method relies on 
the NFW profile which we do not want to assume at this 
stage. 

To remove unbound particles from a substructure, we 
will instead proceed with a first approximation by calculat- 
ing the typical rms velocity and removing particles which 
are statistical outliers. But we cannot calculate the velocity 
dispersion until we know the true centre of mass (CoM) ve- 
locity, so — because we have not removed unbound particles 
from the structure — we must proceed iteratively, beginning 
with an approximation for the CoM. We choose the den- 
sity peak of the substructure (not including its daughters) 
as a first approximation to the CoM. In order to obtain the 
CoM velocity we calculate the median of the velocity of the 
N v = 100 nearest neighbours to the density peak within the 
structure. If the number of particles is less than 30, we take 
half the particles of the structure. In order to obtain a valid 
answer we must pay attention to binaries, which could bias 
the result to large velocities. Hence we identify binaries by 
searching the whole simulation for bound pairs. We so far 
have not found bound pairs of particles in all the simulations 



© 2003 RAS, MNRAS 000.HHTT1 



6 Jochen Weller, Jeremiah P Ostriker, Paul Bode and Laurie Shaw 




M„/[lO»M h->] M h /[10'°M o h-'] 



Figure 5. Left: Mass function of the substructures of the most massive halo of the 256 3 simulation, before (solid line) and after (dashed 
line) structures with less than jVt = 30 have been dissolved into their mothers. The mother halo has PS 160, 000 particles or a mass of 
6.5 X 10 12 h~ 1 Mq . The dotted line is from a power law dN^/dln^M^) <x Af^ -1 ' 4 . Right: The same plot for the 4th most massive halo of 
the 1024 3 simulation with 392,000 particles and a mass of 1.1 X 10 15 /i Mq in the mother halo. 



we studied, which also provides evidence that the simulation 
is not over-resolved. If we did find a bound pair, the two par- 
ticles would be replaced by a single particle with twice the 
mass, and the CoM position and CoM velocity of the pair. 
This ensures that we do not encounter velocity biases due 
to binaries. We then can proceed to calculate the rms ve- 
locity Vr ma = ^(vp ar t — v cm ) 2 ) for the N v = 100 particles 
around the density peak. All particles in the substructure 
which have a velocity 

v part > fcutV ImB , (3) 

are then removed and added to their associated mother 
structure. We iterate this process until the mass change of 
the substructure is less then 5%. We perform this velocity 
cut at two levels: first we use / cut = 8 as noted earlier, and 
as mentioned above N v — 100 particles for the CoM velocity 
and rms velocity calculation. Then choosing a tighter limit 
with / C ut = 6, we find the centre of mass mean velocity 
and velocity dispersion of the inner half of the particles and 
repeat the process. 

In Fig.|S|we show the velocity distributions of the parti- 
cles in the most massive substructure (solid line) in clusters 
#1 and #4. We also show the threshold rms velocity (dotted 
line) during the first step of the iteration. All particles above 
this threshold are moved to the associated mother. The mass 
of the mother halo at the end of this procedure increases just 
by 4.2 x W 11 h- 1 M Q for cluster #1 and 2.0 x W 13 h- 1 M Q for 
cluster #4, where most of the change occurs during the first 
cut-off scheme. After the removal of the velocity outliers we 
again dissolve halos with less than N t = 30 particles into 
their mothers. 

At the end of this step we recalculate the CoM and w rma 
for the subhalo and then we move daughters which have a 
faster CoM velocity than \ // 6v Iuls to the associated mother 
of the substructure under consideration. 



2.2.2 Tree calculation of potentials and bound particles 

We now reach the step where we can remove particles which 
have a total energy larger than zero in the centre of mass 
frame of the structure to which they belong. We will check 
within each substructure which particles are bound to it. We 
calculate the CoM of a substructure including all its daugh- 
ters and compute the potential <j) °f the particles within 
the substructure. The potential calculation is done using an 
adaption of a tree code by Hernquist (1987). Note that we 
switch to an exact direct summation of of the potential en- 
ergy if there are less than 100 particles in the system. The 
total energy of a particle is then 

1 2 

-Etot = m4> + -m ( Vp ar t - v cm ) , (4) 

where m is the mass of a particle, <f> the potential from all 
the other masses within the substructure, and v cm the CoM 
velocity of the substructure. We calculate E to t for each parti- 
cle and then remove the third of the unbound particles with 
the highest energies, moving them to their associated mother 
structure. Note that we choose only a third of the particles 
because otherwise particles are removed to quickly without 
taking into account that the CoM velocity, and hence the ki- 
netic energy, is changing with each removed particle. Ideally 
one shoul d remove only one particle at a time, as it is done 
in SKID iStadel et al.lll997l) . but this is too time consuming 
for hundreds of halos with over 10 particles. We tested dif- 
ferent fractions and observed that one third was the largest 
number which results in a stable result. We then recalculate 
the CoM and iterate this step until there is no change in the 
mass of the system. 

In Fig. |7]we show the mass distribution before and af- 
ter unbound particles have been moved to the mother struc- 
tures. Note that all daughters with less than JVt = 30 parti- 
cles have been dissolved into their mothers. There are many 
unbound particles in the substructures which returned to 
the original mother. The mother in halo #1 now has a mass 
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Figure 6. Distribution of the particle velocities in the most massive substructure, the velocity threshold 8fj? ms (dotted) and the rms 
velocity (dashed). All the particles above the threshold are moved to the associated mother. Cluster #1 is on the left, and #4 on the 
right. 




Figure 7. Mass function of substructure in the test clusters #1 (left) and #4 (right) after particles with high velocities have been moved 
to the associated mothers (solid line) and after unbound particles have been moved to the mother structure (dashed line). Note that the 
power laws are now dN h /dln{M h ) oc M~ 1X) for #1 and dN h /dln(M h ) oc M" ' 8 for #4. 



of 1.2 x lO 13 ft _1 M which corresponds to « 293,000 par- 
ticles; there are now only 134 daughters with a total mass 
of 1.1 x 1O 12 /i _1 M . The mother of halo #4 has a mass of 
1.5xlO 15 h _1 M or 582,000 particles, with 1.2 x lO 14 h _1 M 
remaining in 106 daughters. 

Before we proceed with the next step we will remove any 
daughter which is not bound to its mother. We approximate 
the potential energy for the daughters by 



P d - 

^pot — 



-G 



m d M{r d ) 



G 



— dM(r) , 
r 



(•>) 



where ma is the daughter mass, ra is the distance of the 
CoM of the daughter to the CoM of its mother, and M(r) 
is the total mass of the mother within radius r including all 
other daughters. We then can calculate the kinetic energy of 



the daughter with respect to the centre of mass its mother. 
If the daughter is not bound to her mother we move her to 
the mother of the mother. 



2.2.3 Search for Hyper-structures 

In order to obtain a stable algorithm with respect to the 
smoothing length for the refined Denmax procedure, we 
need, as noted above, to look for "hyper-structures" , groups 
of substructures which are gravitationally collectively bound 
to one another. 

This problem has been addressed previously by com- 
bin ing the SKID algorit hm with an adaptive FOF analy- 
sis iDiemand et alJ l2004): we will take a different approach 
here. In order to do this we investigate primary substruc- 
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tures, ie. structures which are direct daughters of the largest 
structure which is the mother structure. For each such pri- 
mary substructure, we calculate the distance <5r; to each 
other primary substructure with mass mi, and examine the 
one with the maximal rrii/Sr^ as follows; note that the 
masses include all daughters of the primary substructures. If 
these two structures are bound with respect to their common 
CoM, they form a hyper-structure; the less massive of the 
two becomes a daughter of the more massive structure. We 
than re-calculate the CoM and the maximum extension box 
of the new hyper-structure, and check each particle of the 
mother within this box. If it is bound to the hyper-structure, 
we then move it from the mother to this hyper-structure. We 
will iterate this step three times. Note that for both halos the 
mass in the mother structure does not change significantly 
during this step. 

In this fashion bound objects, whose identity is indepen- 
dent of the geometrical tool used to analyse substructure, 
are assembled. 



2.2-4 Final Steps: Daughters and Particles unbound to 
Entire Family 

The next step we perform is to remove daughters which are 
not bound to the biggest structure, the mother halo. And 
finally we remove particles which are not bound to the family 
tree at all. For both halos none of the daughters is unbound 
and the number of unbound particles is negligible. 

In Fig. |H| we show the mass distribution from the re- 
fined Denmax run (solid line) and after the unbinding steps 
(dashed line). There remains 8.8% of the mass in sub- 
structures for #1 (left) and 7.4% for halo #4 (right). For 
cluster #1 dNhfdln(Mh) oc M^ 1 ' 1 for small mass halos, 
the distribution in cluster #4 is roughly approximated by 
dNhfdln(Mh) oc M^~ ' 8 . Note these are only rough power 
laws. 



2.2.5 Truncation of Halo at the "virial radius" and 
Identification of Companions 

Now we check if we have artificially linked together separate 
structures which are only weakly coupled together gravita- 
tionally, and if we have artificially included distant in-falling 
matter. For a ACDM cosmology, it is conventional to define 
the virial mass M v i r and radius -Rvir with 

M vir = -TtI$ it A c {z)pc(z) , (6) 

where p c is the critical density of the universe, and the mean 
over-density A c = 178r2 m (z) ' 45 . Thus we make a rank or- 
dered list of our mother halos, and in each one we start at the 
density maximum and proceed outward until we reach the 
virial radius, within which the mean over-density is A c . We 
truncate the halo at this point, removing all particles from 
outside the virial radius of the halo and identifying daugh- 
ters with centres outside this radius as separate companion 
structures. In this step the mass of the mother halo in clus- 
ter #1 stays almost constant at 1.1 x 1O 13 /i _1 M0 while 96 
subhalos with a total mass of 9.4 x lO n /i _1 M remain. In 
cluster #4 the mother mass is reduced to 1.2 x 1O 15 /i _1 M 
with 65 remaining subhalos of a mass of 8.2 x 1O 13 /i _1 M . 



In Fig. |5|we show the projected density inside the virial 
radius of cluster #1 (left) and cluster #4 (right), with the 
daughters marked. Note that cluster #1 is at a redshift of 
z = 1 while cluster #4 at redshift z = 0.05. Halo #1 has 
a virial radius of r v i r « 0.81 Mpc and halo #4 a radius of 
r v i r w 3.0 Mpc. Note that for halo #4 the substructure is 
much more centered around the core than for halo #1. 



3 DEPENDENCE ON PARAMETERS AND 
TEST OF STABILITY 

In this section we discuss the stability of our method, as 
different possible choices of the parameters and procedures 
may affect the results. 



3.1 Group Finding 

We will first vary the linking length in our rough initial FOF 
analysis in order to establish how sensitive we are to this 
parameter choice. We perform an analysis with a linking 
length of -Rn n k = 0.167n _1 , this could potentially lead 
to a larger fragmentation of initial halos and families and 
hence potentially change our results. We find that the results 
of this run are almost identical with results obtained with 
the original linking length. For halo #4 we have after the 
initial fine Denmax run 35% of the mass in substructures 
(compared to 34% in the original run) which is lowered to 
9% (8%) after we test for bound particles; after the final 
virial cut, 8% of the halo mass in substructures while the 
original run resulted in slightly lower than 8%. This is due to 
the fact that our family tree procedure followed by a refined 
Denmax run produces almost the same large structures. 

We did a further consistency check where instead of 
FOF we used a rough Denmax run with a smoothing length 
of -Rsmooth = l/5n~ x / 3 to identify the initial halo list. The 
results were essentially the same as in the original runs. 
Hence we conclude that our method is stable with respect 
to sensible changes in the initial halo finding algorithm to 
within ±1%, which is well below the statistical fluctuation 
of the sample. 

The next halo finding step we perform is the refined 
Denmax run. We crucially chose in this step the smallest 
sensible smoothing length and then built up the halo hier- 
archy by our family tree algorithm. Making the smoothing 
length smaller than -Rsmooth = 5e would enter the regime 
dominated by uncertainties in the force softening, so we do 
not extend a stability test in this direction. 

Instead we repeated the analysis with a smoothing 
length of -Rsmooth = 10c. Due to the larger smoothing length, 
we find 56% of the mass in substructures after the initial 
denmax step for halo #4; however, when we test for parti- 
cles which are actually bound to these structures we obtain 
already 9% of the mass in substructures. After the inclusion 
of hyper-structures and the density cut this drops to 7%, 
which is in excellent agreement compared to the run with a 
smoothing length -Rsmooth = 5e (8%). 

We hence conclude that we have a reasonably stable cri- 
terion if the smoothing length is chosen within a reasonable 
range. Of course, as the smoothing length is made larger we 
will miss more and more structures. 
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Figure 8. Mass function of substructures in the halo after the refined Denmax run and the mass cut (solid), and at the end of the the 
iterative scheme (dashed). The dotted lines show the power laws dN h /dln{M h ) oc M" 1 ' 1 (left) and M" ' 8 (right). 



2.0 



Surface density in 10" h W./Wpc z 



2.0 2.7 3.4 




2-0 




Figure 9. Particles and daughters in halo # 1 (left) and halo # 4 (right) after the cleanup procedure and removal of particles and 
subhalos outside the virial radius. The dark cross marks the centre of the halo and the boxes the identified and bound daughters. The 
colour corresponds to the surface density as indicated by the colour bar. Cluster #1 has 96 subhalos and #4 has 65. 



3.2 Removal of unbound particles 

The first step of removing unbound particles is performed by 
removing velocity outliers in a gentle way. Since we do this 
already in two steps with first a gentler and then a harder 
cut-off at 8«r ms and 6v^ ms we established that most of the 
cut is happening during the first iteration step. However 
the velocity cut does not change the mass fraction signifi- 
cantly. Final results do not depend on the specific numbers 
(8,6) x «r ms , as long as we approach the final cut gradu- 
ally. Furthermore, we note that this cut was mainly done to 
avoid an unphysical bias toward large CoM velocities, which 
is important for the calculation of the kinetic energies with 
respect to the centre of mass. 



lEvrard et all (l2002h : lwhitel (I2002I) ). we will investigate how 
this definition influences our results. We chose initially the 
virial mass and radius corresponding in a ACDM cosmology 
to the over- density A c {z) = 178Q m (z) 0A5 . We saw already 
in the comparison of the 256 3 and 1024 3 simulations that, 
after this density cut, the fraction of mass in substructures 
can be quite different. However, the simulation with 256 3 
was also at a redshift of z = 1, compared to z — for the 
1024 3 simulation. Hence we performed an analysis of the 
the 1024 3 run where we chose the cut-off over-density to be 
A c = 200 in agreement with another commonly used defi- 
nition. With this cut-off the final mass-fraction in subhalos 
only decreases from 8% to 7% for halo #4. 



3.3 Virial Cut 

Since the definition of a mass or siz e of a halo is t o som e 
extent arbitrary (see for example: Ijenkins et al.l i200ll) : 



© 2003 RAS, MNRAS 000, EE] 



10 Jochen Weller, Jeremiah P Ostriker, Paul Bode and Laurie Shaw 




O.OOOl 0.001 



Figure 10. Cumulative mass function for sub structures with 
more than 30 particles in halo D6 as presented bv lDiemand et all 
J2004t) . The soli d line is from the algo rithm presented here, the 
dotted line from iDiemand et all2004h and the dashed line has a 
slope of M' 1 . 



takes about 8 hours on a SUN BLADE 2000 on a single 900 
MHz processor with 3 Gbyte RAM. We established an ap- 
proximate method to identify and remove unbound particles 
from subhalos, which allows for the efficient calculation of 
bound structures. We analysed three simulations, two done 
by the TPM code develo ped bv lBode et"afl fcOOOl) and one 
bv lDiemand et ail <l2004f) . For all three we find similar mass 
fractions of about 5-8%. iDiemand et al.ll2004) find about 
5% of the mass in substructures, which is identical with our 
findings. 

The fraction of substructure in a cold dark matter clus- 
ter is to some extend a question of definition. If one for 
instance is interested in strong lensing, which tests the dis- 
tribution of matter, the question of bound or unbound struc- 
tures is irrelevant. However, if substructures are the places 
where galaxies form, a full dynamical treatment is relevant 
and requires the inclusion of all forces, including the ones 
from internal and external potential. 

To conclude we emphasize that the presented algorithm 
is stable and fast and ready to be employed for large cosmo- 
logical data sets as well as detailed simulations of clusters 
of galaxies. 



4 APPLICATION TO A DIFFERENT 
SIMULATION 

In order to test o ur algorithm we appli ed it to a simu- 
lation provided bv iDiemand et ail l|2004h . We chose their 
cluster D6, where the simulation was done with a smooth- 
ing length of e = 3.6 k pc, which is comparable to our runs. 
IDiemand et, alJ (|2004h state that their halo finding is com- 
plete for halos with more than 100 particles and they find 
about 5% of the mass in substructures. We also obtain with 
our scheme 5% of the mass in substructures, which is excel- 
lent agreement given the difference of the analysis methods. 
IDiemand et alJ i2004h use a hierarchical version of the SKID 
algorithm which is based on DENMAX. They perform the 
unbinding iteratively and exactly with no approximation like 
the one discussed in Section l2,2l 

In order to get a further inside into the statistics of sub- 
structures we compare the cumulative massf unctions of halo 
D6 fr om our analysis and the analysis by IDiemand et alJ 
(2004). In Fig. 1101 we show the cumulative mass function 
for substructures in the test halo D6. The solid line is from 
our analysis and the dashed line from iDiemand et al.l2004l) . 
They look both very similar and scale very closely to M _1 
until a cut-off at less than a thousandth of the total clus- 
ter mass. Our a nalysis results in sligh tly more substructures 
than the one of iDiemand et alJ2004l) . We find 272 substruc- 
ture while they find 241. This is actually strikingly similar 
given the difference of the presented algorithms and the over- 
all mass fraction in substructures for both analysis is at the 
5% level within 1% uncertainty. 



5 CONCLUSION 

In this paper we have established a fast and stable algo- 
rithm to identify vast numbers of substructures in large 
N-body simulations in a speedy fashion. For example to 
analyse the most massive halo of about 1.5 million particles 
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